composed of 54 Versicolor flowers and five Virginica flowers.
Therefore, based on this node (in fact a leaf), the probability of classifying
a flower as Versicolor was 91% and the probability of classifying it as Virginica
was 9%. The other subspace was composed of 46 Virginica flowers and one
Versicolor flower. On this leaf node, the probability of classifying a flower
as Virginica was 98% and as Versicolor was 2%.
1) root 150 100 setosa (0.33 0.33 0.33)
  2) Petal.Length< 2.45 50 0 setosa (1 0 0) *
  3) Petal.Length>=2.45 100 50 versicolor (0 0.5 0.5)
    6) Petal.Width< 1.75 54 5 versicolor (0 0.91 0.09) *
    7) Petal.Width>=1.75 46 1 virginica (0 0.02 0.98) *
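
The leaf probabilities quoted above are simply class proportions within each subspace. A minimal sketch in base R, assuming the built-in iris data frame and the two split thresholds read from the tree output (the variable names are illustrative only), recomputes them directly:

```r
# Flowers on the right-hand side of the first split (Petal.Length >= 2.45).
non_setosa <- iris[iris$Petal.Length >= 2.45, ]

# The two subspaces produced by the second split on Petal.Width.
leaf_low  <- non_setosa[non_setosa$Petal.Width <  1.75, ]
leaf_high <- non_setosa[non_setosa$Petal.Width >= 1.75, ]

# Class counts and class proportions in each leaf.
counts_low  <- table(leaf_low$Species)
counts_high <- table(leaf_high$Species)
probs_low   <- round(prop.table(counts_low), 2)
probs_high  <- round(prop.table(counts_high), 2)

print(counts_low)
print(probs_low)
print(counts_high)
print(probs_high)
```

The proportions should agree with the last column of the printed tree, since each leaf probability is the class count divided by the number of flowers in that leaf.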
The C50 algorithm
The earliest version of the decision tree algorithm was Iterative Dichotomiser 3 (ID3),
which was developed by Ross Quinlan [Quinlan, 1986]. ID3 was updated
to C4.5 [Quinlan, 1993]. C5.0 (R package C50) is the most recently
updated version. The basic measurement used in this algorithm for
recursively partitioning a data space is the information gain, that is, the
reduction in class entropy achieved by a split. One of the
features of C50 is its graphical presentation of the fitted tree. The R
function of C50 is shown below:
C5.0(formula, data, trials = 1)
For instance, Figure 3.41 shows an example of a C50 tree generated
for the Iris data.
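
A tree of this kind might be fitted along the following lines. This is a sketch only: it assumes the C50 package is installed, uses the built-in iris data, and keeps the package's default control settings, which may differ from those used for the figure. The name c50_fit is illustrative.

```r
# Fit and display a C5.0 tree only if the C50 package is available.
if (requireNamespace("C50", quietly = TRUE)) {
  # Single tree (trials = 1); larger values of trials request boosting.
  c50_fit <- C50::C5.0(Species ~ ., data = iris, trials = 1)

  # Text summary of the fitted tree and its training error.
  print(summary(c50_fit))

  # Graphical presentation of the tree.
  plot(c50_fit)

  # Predicted classes for the first few flowers.
  print(predict(c50_fit, newdata = head(iris), type = "class"))
}
```

Passing trials by name matters here, because the third positional argument of the formula interface is not trials.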
Fig. 3.41. An unpruned C50 tree constructed for the Iris data.
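
To make the information gain criterion concrete, the following base R sketch computes the entropy reduction for two candidate splits of the Iris data. It is a simplified illustration of the measure itself, not the internal implementation of C5.0; the helper names entropy and info_gain are hypothetical, and the thresholds are borrowed from the tree output shown earlier.

```r
# Shannon entropy (in bits) of a vector of class labels.
entropy <- function(labels) {
  p <- prop.table(table(labels))
  p <- p[p > 0]  # drop empty classes so that 0 * log2(0) never occurs
  -sum(p * log2(p))
}

# Information gain of splitting `labels` by a logical condition:
# parent entropy minus the size-weighted entropy of the two children.
info_gain <- function(labels, condition) {
  n <- length(labels)
  left <- labels[condition]
  right <- labels[!condition]
  entropy(labels) -
    (length(left) / n) * entropy(left) -
    (length(right) / n) * entropy(right)
}

# Two candidate first splits on the full data set.
gain_length <- info_gain(iris$Species, iris$Petal.Length < 2.45)
gain_width  <- info_gain(iris$Species, iris$Petal.Width < 1.75)

print(c(Petal.Length = gain_length, Petal.Width = gain_width))
```

The split on Petal.Length isolates all 50 Setosa flowers in a pure subspace, so its gain equals log2(3) - 2/3, which is larger than the gain of the Petal.Width split applied to the full data.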